Using SAS® Procedures FREQ, GENMOD, LOGISTIC, and PHREG to Estimate Adjusted Relative Risks – A Case Study

Author

  • Jiming Fang
Abstract

We present nine methods to compute an adjusted relative risk (RR). These methods were developed over 25 years (1985–2010) and are available through four SAS/STAT® procedures: FREQ, GENMOD, LOGISTIC, and PHREG. We also compare the strengths and limitations of these methods, using an observational cohort study for illustration.

INTRODUCTION

The relative risk (RR) is a common measure of the effect of treatment or exposure on a dichotomous outcome in cohort studies. Researchers are increasingly using observational studies to estimate the effect of treatment on outcomes. However, unlike in randomized controlled trials, treated subjects in non-randomized studies often differ systematically from untreated subjects, so outcomes cannot be compared directly between groups. Statistical methods must therefore be used to adjust for these systematic differences when estimating the effect of treatment on outcomes.

In the present paper, we illustrate nine methods to compute adjusted relative risks, developed over a quarter of a century and available through four SAS/STAT® procedures: FREQ, GENMOD, LOGISTIC, and PHREG. We also compare the strengths and limitations of these methods based on an observational cohort study using data from the Registry of the Canadian Stroke Network.

Study Cohort

We conducted a study to investigate the impact of follow-up at a secondary prevention clinic (SPC) on 1-year mortality in stroke patients. The study cohort was taken from the Registry of the Canadian Stroke Network (RCSN), which includes patients seen at all 11 stroke centers in Ontario, Canada, between July 2003 and March 2006. Data on the date of stroke onset and hospital arrival, stroke type, comorbidities, stroke severity, and outcomes at discharge were abstracted from each patient’s chart by trained nurses using custom RCSN data entry software. Death within 1 year of stroke onset was determined through linkage to a provincial administrative database.
The study cohort consisted of 9074 patients with ischemic stroke or transient ischemic attack (TIA) who were alive at discharge. Of these, 4036 patients were referred for follow-up at a secondary prevention clinic (SPC=1), and 5038 were not (SPC=0). Patients with SPC follow-up differed significantly from those without in their demographic and clinical characteristics (Table 1, P-values <0.05 highlighted in red).

Crude RR using Proc Freq

The crude RR measures the overall association between the risk factor and the outcome, here SPC and 1-year mortality. It can be obtained easily from Proc Freq using the RelRisk option.

    Proc Freq data=StudyCohort;
      Tables SPC*Death_1year / RelRisk;
    Run;

The 1-year mortality rates in SPC patients and non-SPC patients were 6.5% and 14.4%, respectively. The crude RR is 0.454 (95% CI: 0.397-0.519), suggesting that the 1-year mortality rate for SPC patients was 54.6% lower than for non-SPC patients. However, because of the differences in baseline characteristics (Table 1), we must run multivariate analyses to adjust the RR for other factors that may be related to SPC follow-up.

Adjusted RR using Proc Freq – Stratified Mantel-Haenszel

We can use a stratified Mantel-Haenszel chi-square statistic to control for other categorical factors, for example, ambulance transportation and hospital admission. This adjusted RR may identify the role of the risk factor of interest (SPC) after the risk from the other factor(s) has been statistically removed (Greenland & Robins 1985). The Mantel-Haenszel analysis is requested as follows:

    Proc Freq data=StudyCohort;
      /* CMH requests the Mantel-Haenszel estimate of the common RR across strata; */
      /* RelRisk alone reports only the stratum-specific estimates.                */
      Tables Ambulance*Admission*SPC*Death_1year / RelRisk CMH;
    Run;

Statistics and Data Analysis, SAS Global Forum 2011
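As a supplement to the stratified Mantel-Haenszel analysis described above, the pooled estimator can be sketched in its usual textbook form. The notation is introduced here for illustration only and does not appear in the paper: stratum i is assumed to be one combination of ambulance transportation and hospital admission, with a_i deaths among n_{1i} SPC patients, c_i deaths among n_{0i} non-SPC patients, and N_i patients in total.

```latex
% Illustrative notation (an assumption, not taken from the paper):
%   a_i, c_i       = 1-year deaths among SPC / non-SPC patients in stratum i
%   n_{1i}, n_{0i} = number of SPC / non-SPC patients in stratum i
\[
\widehat{RR}_{MH}
  = \frac{\sum_{i} a_i \, n_{0i} / N_i}{\sum_{i} c_i \, n_{1i} / N_i},
\qquad N_i = n_{1i} + n_{0i}
\]

% With a single stratum the weights cancel and the crude RR remains:
\[
\widehat{RR}_{crude}
  = \frac{a / n_{1}}{c / n_{0}}
  \approx \frac{0.065}{0.144}
  \approx 0.451
\]
```

The second line uses the rounded mortality rates reported in the text (6.5% and 14.4%). The result is close to the reported crude RR of 0.454; the small gap is presumably due to rounding of the percentages, since the underlying death counts are not given in this excerpt.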




Publication date: 2011